Random variable, stochastic process and simulation.
Contents
Random variable, stochastic process and simulation.¶
import numpy as np
import scipy
from numpy.random import normal, choice, uniform
import ipywidgets as widgets
import matplotlib.pyplot as plt
plt.style.use('fast')
%matplotlib inline
Random numbers, probability distribution and stat analysis¶
The numpy.random has the fastest random number generators that are based on low level code written in C.
The Scipy.stats has an extensive library of statistical distributions and tools for statistical analysis.
The Statsmodels Enhancing Scipy functionality with more tools
The Seaborn library that enhances matplotlib functionality for stat visualization.
General overview of random numbers in python¶
First we take a look at most widely used random numbers of numpy also called standard random numbers. These are rand (uniform random number on interval 0,1) and randn (stnadard normal random number with 0 mean and 1 variance).
When running code that uses random numbers results will always be different for every run. If you want code to reproduce same result you can fix the seed to get reproducible results:
np.random.seed(8376743)
# Generates standard uniform random numbers U(0,1)
r = np.random.rand(10)
r
array([0.75837541, 0.70300344, 0.39583868, 0.01087556, 0.44637723,
0.2457634 , 0.00973097, 0.54637 , 0.4948939 , 0.95468676])
def rnplot(r):
'''Convenience function for making quick two panel plot showing
a line plot for the sequece of random numbers (RN)
a histogram plot of probability density of random numbers
'''
fig, ax = plt.subplots(ncols=2)
ax[0].plot(r, color='blue', label='trajectory')
ax[1].hist(r, density=True, color='red', label = 'histogram')
ax[0].set_xlabel('Samples of RN')
ax[0].set_ylabel('Values of RN')
ax[1].set_xlabel('Values of RN')
ax[1].set_ylabel('Probability Density')
fig.legend();
fig.tight_layout()
r = np.random.rand(200)
rnplot(r)
In the same way we generate and visualize norally distributed random numbers, \(N(0,1)\)
r = np.random.randn(200)
rnplot(r)
We showed examples of standard random numbers \(U(0,1)\) and \(N(0,1)\)
Parametraized random number allow you to set parameters like value of mean and interval length and therefore can be viewed as generalized versions of standard random numbers. Below we take a look at examples of continuous (random and unofrm RVs) and discrete random numbers generated by paratmerized distributions.
r = np.random.uniform(low=-1, high=1, size=(5, 200))
print(r.shape)
rnplot(r[0,:])
(5, 200)
r=np.random.normal(loc=8, scale=10, size=(2, 2000))
print(r.shape)
rnplot(r[0,:])
(2, 2000)
r = np.random.binomial(n=10, p=0.6, size=2000)
print(r.shape)
rnplot(r)
(2000,)
Using random numbers to get answers via simulations¶
One of the major uses of random numbers is for conducting numerical simulations. What is a simulation? It is a recreation of a process on a computer. And this recreation is done by random numbers. E.g to simulate coint tosses, die throws, diffusion of molecules, conformational change of polymers we use random number to recreate the process on a computer. Let’s start off by asking some simple questions
How often do we get a run of 5 or more consecutive heads in 100 coin tosses if we repeat the experiment 1000 times?
What if the coin is biased to generate heads only 40% of the time?
We will use np.random.choice which generates random samples from limited options.
np.random.choice(a, size, p, replace=True)
a:an array/list of choices like,[-1, 1]or[1, 2, 3]or['bagel', 'muffin']p:probabilities of picking choices from a. E.gp=[0.5, 0.5]. Default is equal probabilities.size:shape of an array could beN, (N,M), (N,M,Q)etcreplace=True.If you choose bagel from a, the caffee will put back a new one on shelf.
L = 100 # length of each trajectory
N = 1 # number of experiments: stochastic trajecotries generated
xs = np.random.choice([0,1], size=(L, N)) # (i) Unbiased coin p=[0.5,0.5] by default
ys = np.random.choice([0,1], size=(L, N), p=[0.9, 0.1]) # (ii) biased coin
fig, ax = plt.subplots(nrows=2)
r = np.random.randn(200)
ax[0].plot(xs, 'o', alpha=0.5, label = 'data1')
ax[0].plot(ys, 'o', alpha=0.5, label = 'data2')
ax[1].hist((xs[:,0], ys[:,0]), 2, label = ("data1", "data2"))
fig.legend();
Simulating a 1D unbiased random walk¶
In the course of simulating random walks we will be genreating multidimensional numpy arrays. We will adhere to a convnetion that:
Rows are regarded as number of measurments, or samples
Columns are regarded as number of observables distinct measruments/trajectories
We then take cumulative sum over trajectory [a,b,c,…], which accumulates random walker’s position over time [a, a+b, a+b+c,…]. This is done by convenient
np.cumsum()method.
def rw_1d(L, N):
'''
L: trajectory length
N: Number of trajecotry
returns np.array (L, N) shape
'''
# Create random walks
r = choice([-1,1], size=(L, N))
#Accumulate position
rw = r.cumsum(axis=0)
#Set initial position
rw[0,:]=0
return rw
rw = rw_1d(2000, 1000)
print(rw.shape)
L = rw.shape[0]
N = rw.shape[1]
(2000, 1000)
# Simulate 1D random walk
Tmax = 1000
N = 1000
rw = rw_1d(Tmax, N)
@widgets.interact(t=(1, Tmax-1))
def rw_plotter(t=1):
fig, ax = plt.subplots(nrows=2)
ax[0].plot(rw)
ax[0].axvline(x=t, color='black', linestyle='-', lw=3)
ax[1].hist(rw[t, :], density=True)
## Plot gaussian with width t**0.5
x = np.linspace(-100,100, 1000)
y = scipy.stats.norm.pdf(x, 0, np.sqrt(t))
ax[1].plot(x,y)
ax[0].set_ylabel('Position')
ax[0].set_title('RW trajectries');
ax[1].set_xlabel('Position')
ax[1].set_ylabel('Histogram')
ax[1].set_xlim([-100, 100])
fig.tight_layout()
Connection with diffusion: Mean square displacement¶
L, N = 2000, 1000
rw = rw_1d(L, N)
t = np.arange(L)
msd = np.mean(rw**2, axis=1)
rsd = msd**0.5
plt.loglog(np.arange(L), rsd, lw=3)
plt.loglog(t, np.sqrt(t), '--')
plt.title('Compute mean square deviation of 1D random walker',fontsize=15)
plt.xlabel('Number of steps, n',fontsize=15)
plt.ylabel(r'$MSD(n)=\langle x(n)^2 \rangle^{1/2}$',fontsize=15);
Questions
Compute MSD for 1D random walk show the expected scaling
Generate random walks from different positions
choice between [-1,1]
Multinomial choice
Self avoiding walks
def rw_2d(L, N):
'''2d random walk function:
L: trajectory length
N: Number of trajecotry
returns np.array with shape (L, N)
'''
verteces = np.array([(1, 0),
(0, 1),
(-1, 0),
(0, -1)])
rw = verteces[choice([0,1,2,3], size=(L, N))]
rw[0, :, :] = 0
return rw.cumsum(axis=0)
traj = rw_2d(L=10000, N=1000)
print(traj.shape)
(10000, 1000, 2)
Compute room mean square distance of a random walker
fig, ax = plt.subplots(nrows=2, figsize=(10,10))
#Simulate 2D random walk
L, N = 10000, 1000
traj = rw_2d(L, N)
#Compute RSD
t = np.arange(L)
r2 = np.sum(traj**2, axis=2)
mean_r2 = np.mean(r2, axis = 1)
rsd = np.sqrt(mean_r2)
ax[0].plot(traj[:3000, :5, 0], traj[:3000, :5, 1]);
ax[1].loglog(t, rsd, lw=3, alpha=0.5);
ax[1].loglog(t, t**0.5, '--');
ax[0].set_title('2D random walker',fontsize=15)
ax[0].set_xlabel('X')
ax[1].set_xlabel('Y')
ax[1].set_xlabel('Number of steps, t',fontsize=15)
ax[1].set_ylabel(r'$MSD(n)=\langle R(t)^2 \rangle^{1/2}$',fontsize=15);
Binomial distribution of a random walker¶
In previous example we started with random variables with no reference to probability distribution. This time we will generate random numbers from a binomial distribution.
This time we will use scipy.stats library which contains probability distirbution functions. That is in addition to generating random variables we can also compute probability distributions and related quantities analytically.
from scipy.stats import binom, norm, poisson
s = binom(10, 0.5) # Let us declare s to be a binomial RV
print(s.rvs(20)) # 20 random samples form X
print(s.pmf(5)) # P(X = 5)
print(s.cdf(5)) # P(X <= 5)
print(s.mean()) # E[X], mean
print(s.var()) # Var(X), variance
print(s.std()) # Std(X), standard deviation
[5 8 3 4 5 2 8 4 3 3 4 7 5 7 6 5 4 3 5 4]
0.24609375000000003
0.623046875
5.0
2.5
1.5811388300841898
def coin_flip(L,p,N):
'''
L: flip coint L times
q: with p probability
N: Repeat experiment N times
'''
coin = binom(L, p) # Binomial RV
coin_flips = coin.rvs(N) # Generate sample of N points
coin_pmf = coin.pmf(np.arange(L+1)) # Generate PMF from 0 to L
coin_cdf = coin.cdf(np.arange(L+1))
return coin_flips, coin_pmf, coin_cdf
fig, ax=plt.subplots(nrows=1, ncols=3,figsize=(12,3))
#Coin flip 1 time.
X1, P1, CP1 = coin_flip(20, 0.5, 80)
ax[0].plot(X1,'-',color='grey')
ax[1].hist(X1,density=True)
ax[1].plot(P1,'-o')
ax[2].plot(CP1,'-d',color='red')
[<matplotlib.lines.Line2D at 0x7f7420cad790>]
Continuous time random walk: Brownian motion¶
We have learned about random variables, random walk and have encorunetred a concept of stochastic process on the example of discrete step 1D random walk. Now let us generate the prototypical stochasic process in continuous time; the brownian motion. Brownian motion was first discoverd by a botanist Brown who noticed that a pollen in solution undergo erratic and incessant motion. To simulate brownian motion we take the continuous time limit of random walk and approximate dsiplacements of our particle as normally distributed (binomial->normal, time step->continuous time)
We assume we have started at position \(\mu=0\) and our variance is given by \(\sigma^2=2Dt\) Where D is the diffusion coefficnets which is related to parameters of discree random walk as shown in the lecture.
In the last step we re-wrote brownian motion equation in a convenient way by shifting normally distributed radnom variable by \(\mu\) and scaling it by \(\sigma\)
def brown(T, N, dt=1, D=1):
"""
Creates 3D brownian path given:
time T
N=1 trajecotires
dt=1 timestep
D=1 diffusion coeff
returns np.array with shape (N, T, 3)
"""
nT = int(T/dt) # how many points to sample
dR = np.sqrt(2*D*dt) * np.random.randn(N, nT, 3) # 3D position of brownian particle
R = np.cumsum(dR, axis=1) # accumulated 3D position of brownian particle
return R
Brownian motion¶
Below we proceed to simulate continuous random walk in 1D-3D
We will plot trajecotires and distributions of brownian particle using interactive plotting via ipywidgets and holoviews/plotly interface.
R=brown(T=3000, N=1000)
print(R.shape)
(1000, 3000, 3)
@widgets.interact(t=(10,3000-1))
def brownian_plot(t=10):
fig, ax = plt.subplots(ncols=2)
ax[0].plot(R[:5, :t, 0].T, R[:5, :t, 1].T);
ax[1].hist(R[:, 10, 0], density=True, color='red')
ax[1].hist(R[:, t, 0], density=True)
ax[1].set_ylim([0,0.1])
ax[0].set_ylim([-200, 200])
ax[0].set_xlim([-200, 200])
fig.tight_layout()
import holoviews as hv
hv.extension('plotly')
plots = []
for i in range(10):
plot = hv.Path3D(R[i,:,:], label='3D random walk').opts(width=600, height=600, line_width=5)
plots.append(plot)
hv.Overlay(plots)
rw_curve = [hv.Curve((R[i,:,0], R[i,:,1])) for i in range(10)]
xdist = hv.Distribution(R[:,10,0], ['X'], ['P(X)'])
ydist = hv.Distribution(R[:,10,1], ['Y'], ['P(Y)'])
hv.Overlay(rw_curve) << ydist << xdist
Diffusion Equation¶
The movement of individual random walkers \(\leftrightarrow\) density of walkers \(\rho(\vec{r},t)\)
Diffusion equation:
Formulated empirically as Fick’s laws
This is a 2nd order PDE! Unlike equations of motion diff eq shows irreersibile behaviour
This one exactly solvable. But in general reaction-diffusion PDEs difficult to solve analytically.
Can solve numerically by writing derivatives as finite differences!
Can also simulate via random walk!
Diffusion coefficient \(D\), Units \([L^2]/[T]\)
Important special case solution (here written in 1d):
where \(\sigma(t)=\sqrt{2{D}t}\)
density remains Gaussian
Gaussian becomes wider with time
check that this is indeed a solution by plugging into the diffusion equation!
def sigma(t, D = 1):
return np.sqrt(2*D*t)
def gaussian(x, t):
return 1/np.sqrt(2*np.pi*sigma(t)**2) * np.exp(-x**2/(2*sigma(t)**2)) #
@widgets.interact(t=(1,100,1))
def diffusion(t=0.001):
R = brown(T=101, N=1000)
x = np.linspace(-20, 20, 100)
plt.plot(x, gaussian(x, 1), '--', color='orange', label='t=0')
plt.plot(x, gaussian(x, t), lw=3, color='green', label=f't={t}')
plt.hist(R[:,t,0], density=True, alpha=0.6, label='simulation hist')
plt.legend()
plt.ylabel('$p(x)$')
plt.xlabel('$x$')
plt.xlim([-25, 25])
References¶
The mighty little books
More in depth
On the applied side